# Memory-efficient inference

The models below are GGUF builds published by Mungert, most of them using IQ-DynamicGate ultra-low-bit (1-2 bit) quantization aimed at inference in memory-constrained environments.

| Model | License | Description | Tags | Author | Downloads | Likes |
|---|---|---|---|---|---|---|
| Qwen3 30B A6B 16 Extreme GGUF | | Ultra-low-bit quantization built on Qwen/Qwen3-30B-A3B-Base, supporting a 32k context length and suitable for a range of hardware environments. | Large Language Model, Transformers | Mungert | 1,321 | 1 |
| Phi 2 GGUF | MIT | Text generation model using IQ-DynamicGate ultra-low-bit (1-2 bit) quantization, suited to natural language processing and code generation tasks. | Large Language Model, Multilingual | Mungert | 472 | 2 |
| Granite 3.3 8b Instruct GGUF | Apache-2.0 | Ultra-low-bit (1-2 bit) quantization using IQ-DynamicGate technology, suited to memory-constrained environments. | Large Language Model | Mungert | 759 | 2 |
| Qwq 32B GGUF | Apache-2.0 | Ultra-low-bit (1-2 bit) quantization using IQ-DynamicGate technology, supporting multilingual text generation. | Large Language Model, English | Mungert | 5,770 | 17 |
| Olympiccoder 32B GGUF | Apache-2.0 | Code generation model based on Qwen2.5-Coder-32B-Instruct, quantized with IQ-DynamicGate ultra-low-bit technology for efficient inference in memory-constrained environments. | Large Language Model, English | Mungert | 361 | 3 |
| EXAONE Deep 32B GGUF | Other | 32B-parameter large language model supporting English and Korean, designed for text generation tasks. | Large Language Model, Multilingual | Mungert | 2,249 | 3 |
| EXAONE Deep 7.8B GGUF | Other | 7.8B-parameter model with IQ-DynamicGate ultra-low-bit (1-2 bit) quantization, supporting English and Korean text generation. | Large Language Model, Multilingual | Mungert | 1,791 | 5 |
| Qwen2.5 14B Instruct 1M GGUF | Apache-2.0 | Instruction-tuned model based on Qwen2.5-14B, supporting text generation and chat scenarios. | Large Language Model, English | Mungert | 1,600 | 3 |